Unveiling the landscape of pathomics in personalized immunotherapy for lung cancer: a bibliometric analysis

Background Pathomics has emerged as a promising biomarker that could facilitate personalized immunotherapy in lung cancer. It is essential to elucidate the global research trends and emerging prospects in this domain. Methods The annual distribution, journals, authors, countries, institutions, and keywords of articles published between 2018 and 2023 were visualized and analyzed using CiteSpace and other bibliometric tools. Results A total of 109 relevant articles or reviews were included, demonstrating an overall upward trend. Hot keywords in this field include “deep learning”, “tumor microenvironment”, “biomarkers”, “image analysis”, “immunotherapy”, and “survival prediction”. Conclusion Future research is expected to apply artificial intelligence and pathomics to the digital analysis of tumor tissues and the tumor microenvironment in lung cancer patients, based on histopathological tissue sections. By integrating comprehensive multi-omics data, this strategy aims to deepen the assessment, characterization, and understanding of the tumor microenvironment, thereby elucidating a broader spectrum of tumor features. The resulting multimodal fusion models could enable precise evaluation of personalized immunotherapy efficacy and prognosis for lung cancer patients, potentially establishing a pivotal frontier in this domain of investigation.


Introduction
Lung cancer remains one of the most prevalent malignancies and is the leading cause of cancer-related mortality worldwide (1,2). The majority of lung cancers (80-90%) are non-small cell lung cancer (NSCLC), often diagnosed at an advanced stage (65%), potentially with concurrent local or distant metastasis (3). Recent advances in immunotherapy, particularly the use of immune checkpoint inhibitors (ICIs), have shown promising outcomes in improving the prognosis of lung cancer patients (4). Nevertheless, not all patients benefit from immunotherapy, highlighting the need for additional research into predictive biomarkers of immune response. These biomarkers, which may include substances, structures, or products of processes within the body, have the potential to facilitate personalized immunotherapy by enabling the monitoring of immune reactions.
Each lung cancer patient undergoes histopathological diagnosis, in which biopsy tissue is prepared as pathological slides for examination. Slides traditionally preserved by wax embedding can now be digitized and archived as digital pathology images. This technological advancement serves as a foundation for applying big data analytics to digital pathology images. Consequently, the field of pathomics has emerged (5). Pathomics entails applying machine learning techniques to extract large-scale, objectively quantifiable, and readily analyzable datasets from digitally scanned pathological tissue images. Consistent with the pathological diagnostic requirements of diseases, morphological features such as size and shape are extracted from pathological images, along with multi-dimensional subtle features that reflect potential biological characteristics, such as texture features and edge gradient features. These features can be used for quantitative disease screening, diagnosis, prognosis prediction, and other applications (6).
In this study, CiteSpace (7) was used to conduct the first analysis of hotspots and trends in the application of pathomics in lung cancer. The objective is to provide valuable insights for scholars involved in research within this domain.
Materials and methods

Data collection
The Web of Science Core Collection (WoSCC) database was chosen as the literature retrieval platform. The retrieval period spanned from 2018 to 2023, with the final search conducted on October 20, 2023. Subject terms were exclusively employed as the search method, and the search formula was: TS=("Pathomics" OR "Pathomics" OR "Digital Pathology" OR "Whole-slide Imaging" OR "Whole Slide Imaging" OR "Computational Pathology") AND TS=("Lung Cancer" OR "Pulmonary Cancer" OR "Carcinoma of Lung" OR "Pulmonary Carcinoma" OR "Cancer of Lung" OR "Bronchogenic Carcinoma" OR "Bronchogenic" OR "Cancer of the Lung" OR "NSCLC" OR "SLC"). The document type was restricted to Articles or Review Articles, and a total of 109 documents were retrieved.
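The structure of the search formula (two topic clauses joined by AND, each an OR-list of quoted terms) can be sketched in a few lines of Python. This is only an illustration of how such a string might be assembled; the helper name ts_clause is hypothetical, and the term lists are copied from the formula above, with the repeated "Pathomics" entry collapsed to one.

```python
def ts_clause(terms):
    """Build one topic clause: TS=("a" OR "b" ...), dropping repeated terms."""
    unique_terms = list(dict.fromkeys(terms))  # keeps first-seen order
    return "TS=(" + " OR ".join(f'"{t}"' for t in unique_terms) + ")"


# Term lists as given in the search formula (the source lists "Pathomics" twice).
pathomics_terms = [
    "Pathomics", "Pathomics", "Digital Pathology", "Whole-slide Imaging",
    "Whole Slide Imaging", "Computational Pathology",
]
lung_terms = [
    "Lung Cancer", "Pulmonary Cancer", "Carcinoma of Lung",
    "Pulmonary Carcinoma", "Cancer of Lung", "Bronchogenic Carcinoma",
    "Bronchogenic", "Cancer of the Lung", "NSCLC", "SLC",
]

query = f"{ts_clause(pathomics_terms)} AND {ts_clause(lung_terms)}"
print(query)
```

Keeping the terms in lists makes it easier to spot duplicates or missing synonyms before the query is run.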

Statistical methods
The complete records and cited references of the 109 documents retrieved from WoSCC, comprising 85 articles and 24 reviews, were exported in plain text format. A comprehensive analysis of the literature was conducted using CiteSpace 6.2.R4 (64-bit) Basic, focusing on country, institution, authorship, keywords, and cited references. The bibliometric online analysis platform developed by the National Science Library of the Chinese Academy of Sciences was employed for a visual analysis of historical keywords and national collaborations.
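To make the export step concrete, the following is a minimal sketch of reading a WoSCC plain-text export outside CiteSpace. It assumes the usual tagged layout (a two-letter field tag at the start of a line, indented continuation lines, and ER closing each record); the two sample records are invented for illustration and are not taken from the 109 retrieved documents.

```python
from collections import Counter


def parse_wos(text):
    """Parse tagged plain-text records into a list of {tag: [values]} dicts."""
    records, current, tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            continue
        head, body = line[:2], line[3:].strip()
        if head == "ER":                      # end of one record
            records.append(current)
            current, tag = {}, None
        elif head in ("FN", "VR", "EF"):      # file header / footer tags
            continue
        elif head.strip():                    # a new field tag such as AU, DE, PY
            tag = head
            current.setdefault(tag, []).append(body)
        elif tag:                             # indented continuation of the field
            current[tag].append(body)
    return records


def author_keywords(record):
    """Split the DE (author keywords) field on semicolons and normalize case."""
    raw = " ".join(record.get("DE", []))
    return [k.strip().lower() for k in raw.split(";") if k.strip()]


# Hypothetical two-record export, for illustration only.
SAMPLE = """FN Clarivate Analytics Web of Science
VR 1.0
PT J
AU Doe, J
   Roe, R
TI Hypothetical record one
DE deep learning; digital pathology
DT Article
PY 2021
ER

PT J
AU Poe, E
TI Hypothetical record two
DE tumor microenvironment; Deep Learning
DT Review
PY 2022
ER

EF"""

records = parse_wos(SAMPLE)
doc_types = Counter(r["DT"][0] for r in records)
keyword_counts = Counter(k for r in records for k in author_keywords(r))
print(doc_types, keyword_counts.most_common(3))
```

Counts by document type, year (PY), or keyword obtained this way can serve as a cross-check on the figures reported by the visualization tools.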

Annual publication volume in WoSCC
A total of 109 matching documents were retrieved, and the overall publication output exhibited a general upward trend, with 2021 alone accounting for 26.61% of the total (Figure 1). The annual average publication output is approximately 21.8 articles. The results indicate a gradual increase in the attention paid to pathomics research in the context of lung cancer.
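The two summary figures above can be reproduced with simple arithmetic. In the sketch below, the 2021 count of 29 is back-calculated from the stated 26.61% share and is therefore an assumption, since the per-year counts appear only in Figure 1. The mean of 21.8 corresponds to dividing 109 by five, whereas 2018 through 2023 inclusive spans six calendar years, so both divisors are shown.

```python
TOTAL_DOCS = 109          # documents retrieved (from the text)
SHARE_2021 = 26.61        # percent of output in 2021 (from the text)

# Back-calculated count for 2021 (assumption: not stated directly in the text).
count_2021 = round(TOTAL_DOCS * SHARE_2021 / 100)
share_check = round(100 * count_2021 / TOTAL_DOCS, 2)

# Annual mean under two possible denominators.
mean_over_5_years = round(TOTAL_DOCS / 5, 1)
mean_over_6_years = round(TOTAL_DOCS / 6, 1)

print(count_2021, share_check, mean_over_5_years, mean_over_6_years)
```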

Distribution of source journals
The literature selected from the 109 studies on pathomics in the management of lung cancer has been indexed by 146 journals. For the top 10 journals in terms of publication output, detailed information on Journal Citation Reports (JCR) category, publication quantity, impact factor (IF), and their respective contribution percentages is provided in Table 1.

Visualization of collaborations between countries and institutions
Running CiteSpace for country analysis produced a knowledge graph with 35 nodes and 80 edges (Figure 2). Each

Visualization of author collaborations
Running CiteSpace for author analysis produced a knowledge graph with 200 nodes and 383 edges (Figure 4). Each circular node represents an author, and the connections between nodes represent collaborative relationships between authors. The thickness of the connections reflects the degree of collaboration. Different colors of nodes represent different time periods. Based on a co-occurrence analysis of the author collaboration network in the literature retrieved from WoSCC, Table 3 lists the top 5 authors in terms of publication output along with their affiliated institutions in this research field.
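As a rough illustration of how such a collaboration graph is formed, the sketch below links every pair of co-authors on a paper and computes network density, 2E / (N(N - 1)), a summary statistic commonly reported for these networks. The author lists are hypothetical, and the function names are illustrative rather than part of CiteSpace; the density function is also applied to the node and edge counts reported in the text.

```python
from collections import Counter
from itertools import combinations


def coauthor_network(papers):
    """Return (nodes, weighted edges) from a list of per-paper author lists."""
    nodes, edges = set(), Counter()
    for authors in papers:
        unique = sorted(set(authors))
        nodes.update(unique)
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1      # edge weight = number of joint papers
    return nodes, edges


def density(n_nodes, n_edges):
    """Share of possible undirected links that are actually present."""
    if n_nodes < 2:
        return 0.0
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


# Hypothetical author lists, for illustration only.
toy_papers = [["A", "B", "C"], ["A", "B"], ["C", "D"]]
toy_nodes, toy_edges = coauthor_network(toy_papers)

author_density = density(200, 383)    # author network reported in the text
country_density = density(35, 80)     # country network reported in the text
print(len(toy_nodes), len(toy_edges), author_density, country_density)
```

Under this definition the author network is much sparser (about 0.019) than the country network (about 0.134), which is expected when many authors appear in only one or two papers.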

Co-occurrence analysis of keywords
Keyword-related analysis, as manifested in the visualization of co-occurrence patterns, is crucial for delineating the research hotspots and frontiers within a given domain. Running CiteSpace with author keywords as the node type generated a keyword co-occurrence network with 159 nodes and 334 edges (Figure 5). After removing redundant terms that overlap with the search strategy, an analysis of the co-occurrence frequency and centrality values of keywords in this field (Table 4) reveals that the prominent keywords include: deep learning, artificial intelligence (AI), computer-aided diagnosis, tumor microenvironment, feature extraction, image analysis, tumor mutation burden, survival prediction, Markov random field, and mixture model. Furthermore, Figure 6

Keyword cluster analysis
Keyword cluster analysis uses the log-likelihood ratio (LLR) method to analyze the connections among significant keyword nodes. This method reflects the hot topics within the research domain, with closely connected keywords in a cluster indicating higher research intensity. Larger node values within a cluster signify greater research interest. By examining these clusters, it is possible to predict the developmental patterns and emerging trends in the research field (9).
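For readers unfamiliar with the LLR statistic, the sketch below shows one standard form of it (the G-squared statistic on a 2x2 contingency table) applied to the question of whether a keyword is over-represented in a cluster. How CiteSpace computes its labels internally is not described here, so this should be read as an illustration of the general technique; the counts are hypothetical.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """G-squared statistic for a 2x2 table.

    k11: documents in the cluster that contain the keyword
    k12: documents in the cluster that do not
    k21: documents outside the cluster that contain the keyword
    k22: documents outside the cluster that do not
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    cells = ((k11, 0, 0), (k12, 0, 1), (k21, 1, 0), (k22, 1, 1))
    total = 0.0
    for observed, i, j in cells:
        if observed > 0:                       # 0 * log(0) is treated as 0
            expected = rows[i] * cols[j] / n
            total += observed * math.log(observed / expected)
    return 2.0 * total


# Hypothetical counts: a keyword concentrated in a cluster vs. one spread evenly.
strong = log_likelihood_ratio(15, 5, 5, 75)
weak = log_likelihood_ratio(8, 12, 12, 68)
none = log_likelihood_ratio(10, 10, 10, 10)
print(strong, weak, none)
```

A keyword whose frequency inside a cluster matches its frequency elsewhere scores zero, while a keyword concentrated in the cluster scores high and is a natural candidate for the cluster label.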

Cited references
A total of 426 relevant articles were retrieved from WoSCC, accumulating a total of 10,174 citations. The average number of citations per article is 24. The top 10 most cited articles are listed in Table 5.
Visual map of author network.

Discussion
Pathomics is an innovative interdisciplinary field that combines digital pathology and AI. The rise of digital pathology has enabled the scanning of whole tissue slides, based on the fundamental principle of digitizing whole-slide images (WSI) using state-of-the-art whole-slide scanners. This technology converts standard hematoxylin-eosin (H&E) stained glass slides into a digital format (WSI) (20). This allows for detailed spatial exploration of the entire tumor heterogeneity and its most invasive elements. Pathomics then automatically extracts and classifies histological features, transforming this information into binary data. Finally, the extracted features are processed through sophisticated computer algorithms to perform tasks such as cancer classification and outcome prediction (21). Computational analysis of digitized histological slides through pathomics can extract valuable information. Some research primarily focuses on predicting the prognosis of lung cancer (22), including improving clinical decisions for cancer immunotherapy and exploring biomarkers related to potential benefits from ICIs, such as microsatellite instability (MSI), PD-L1 TPS, and inflammatory genes, among others (23). Another significant research area involves the integration of pathomics with multiple omics disciplines to explore the classification of lung cancer and other related aspects. Alvarez-Jimenez C et al. demonstrated the potential existence of cross-scale correlations between pathomics and CT imaging, which could be used to identify relevant imaging and histopathological features (24).
The escalating demand for personalized cancer treatment necessitates more precise biomarker assessments and quantitative tissue pathology for accurate cancer diagnosis. Pathologists must be equipped with new methodologies and tools to enhance diagnostic sensitivity and specificity, ultimately contributing to more informed and improved treatment decisions (13). Recently, significant success has been achieved in the analysis of medical images using AI due to the rapid advancement of "deep learning" algorithms (16).
Recent breakthroughs in AI hold the promise of significantly changing the way we diagnose and stratify cancer in pathology. Deep learning technology represents a milestone in this transformation, with numerous deep learning architectures applied to pathology-focused research. Various modeling objectives have been pursued, and recent studies demonstrate the application of deep learning in pathology aiming to predict

Table columns: Rank, Keywords, Frequency, Centrality.
Variation in the number of keywords.
Visual map of title keywords network.
Visual map of subject categories keywords network.
for tumor-infiltrating lymphocytes (TILs) to predict the response of NSCLC to immune checkpoint inhibitor therapy (36). Additionally, Coudray N, Ocampo PS et al. applied AI to digital pathology slides to predict the presence of mutations in lung adenocarcinoma (37). In summary, the development of these advanced deep learning algorithms enhances the capability of analyzing lung cancer pathology images, assisting pathologists in challenging diagnostic tasks such as tumor identification, metastasis detection, and analysis of the tumor microenvironment.
The tumor microenvironment (TME) is primarily composed of tumor cells, lymphocytes, stromal cells, macrophages, blood vessels, and other components. The composition of the TME varies based on the relative proportions of its different constituents, and it plays a crucial role in the growth and invasion of tumors.
Immune cells within the TME exhibit dual functions: on one hand, they identify and destroy tumor cells, while on the other hand, they also promote tumor growth and metastasis (38,39). For instance, immune cells, including T cells, B cells, macrophages, and myeloid-derived suppressor cells, can modulate the TME, thereby influencing tumor metastasis and pathological features (40, 41). The infiltration of tumor-infiltrating lymphocytes (TILs) into the TME involves a complex network of multiple cell types and cytokines and is a hallmark of immune recognition. Numerous studies have shown that activated CD8+ T cells are the major players involved in anti-tumor immunity, and in a subset of tumors, cancer cells inhibit the activation of CD8+ cytotoxic T cells through the expression of ligands such as PD-L1 that bind to inhibitory checkpoints, which has been suggested to be an important mechanism of immune escape for cancer cells (42). The expression of PD-L1 on TME immune cells, including myeloid cells (macrophages, dendritic cells) and T cells, appears to correlate more with the ICI response than expression on tumor cells. However, in NSCLC clinical practice, a limitation in histologically characterizing T lymphocyte infiltration is the scarcity of tumor tissue, which has hampered insight into the role of T lymphocytes in influencing the ICI response (43). Tumor-associated macrophages can promote angiogenesis and invasion by secreting cytokines, growth factors, and proteases (44). Cancer-associated fibroblasts (CAFs) are pivotal in the formation of organs and the maintenance of tissue structure and function. They also play a significant role in tumor initiation, progression, metastasis, and the development of drug resistance through their potent immunosuppressive capabilities. Activated CAFs can secrete various substances, including extracellular matrix and vascular endothelial growth factor (VEGF), contributing to the complexity of the TME (45,46). The markers associated with CAFs are predominantly linked to T cell immunosuppression, inhibiting the functions of CD8+ T cells and natural killer cells, particularly by secreting various chemokines and cytokines, notably interleukin-6 (IL-6), which leads to suboptimal clinical treatment outcomes. As research into the effects of CAFs and the TME on immune cells and the efficacy of cancer immunotherapy advances, scientists can potentially develop novel compounds targeting these mechanisms, thereby offering innovative strategies for immunotherapy (47). It is noteworthy that research indicates a significant impact of the TME on the survival benefits of immunotherapy (48). The presence of immune cells in the TME, including the percentage of CD8+ T cells, can serve as a predictive factor for the effectiveness of immunotherapy (49). The extracellular matrix can influence the mechanisms of tumorigenesis by affecting cell growth, metastasis, and immune evasion through the activation of signaling pathways. Additionally, tumor cells can release various growth factors, such as tumor growth factor, endothelial growth factor, and VEGF, which promote new blood vessel development (50). Angiogenesis is crucial for providing nutrients and oxygen to tumor cells, ultimately playing a critical role in tumor growth. Therefore, the TME plays a crucial role in tumor growth and metastasis. A comprehensive understanding of TME formation, investigating the interplay between immune cells and tumors, and exploring various genetic variations represent the future directions of TME research (51,52). Additionally, selecting targeted therapeutic strategies based on TME subtypes can enhance the effectiveness of cancer treatment. To further emphasize this point, computer-assisted automatic detection of tumor cells in lymph nodes can significantly reduce the false-negative rate, thereby facilitating earlier detection and treatment of lung cancer, improving the accuracy of TNM staging, accelerating the examination process, and reducing the workload of pathologists. Moreover, tumor spread through air spaces (STAS) has been identified as an important clinical factor associated with tumor recurrence and poor prognosis in patient survival. The identification and quantification of STAS require experienced pathologists to perform detailed examinations of entire tissue sections. Therefore, pathological image analysis tools that rapidly and accurately identify STAS would be useful for pathologists (16). Quantitative characterization of the TME and accurate prediction and classification of important TME components are essential for targeted tumor therapy and prognosis assessment (53), necessitating advanced data processing and analysis approaches.
Visual map of keywords network.
Quantitative characterization of TME involves a crucial step of segmenting different types of tissue substructures and cells from pathological images. This segmentation forms the foundation for various image analysis tasks, including cellular composition, spatial organization, and morphology specific to substructures. Previous studies in oncology primarily focused on tumor cells, overlooking the pivotal role of TME in the initiation and progression of cancer. The TME of lung cancer is primarily composed of tumor cells, lymphocytes, stromal cells, macrophages, blood vessels, and other components. Studies in lung cancer have indicated that TILs are positive prognostic factors, while angiogenesis is negatively associated with survival outcomes. The role of stromal cells in prognosis is complex. Traditional image processing methods encompass feature definition, feature extraction or segmentation. These techniques have been employed to segment lymphocytes and analyze the spatial organization of TILs and stromal cells within the TME (54). Research associated with the quantitative characterization of TME has the potential to predict treatment outcomes and provides insights for the development of targeted therapeutic strategies. Innovative studies in immunotherapy, in particular, heavily rely on understanding the interactions among various components within the TME and the mechanisms of immune evasion.
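The distinction between cellular composition and spatial organization can be made concrete with a small sketch. It assumes that segmentation has already produced a labeled centroid for each cell; the coordinates, labels, and function names below are hypothetical and do not come from any cited study.

```python
import math
from collections import Counter


def composition(cells):
    """Fraction of each cell type among all segmented cells."""
    counts = Counter(label for label, _, _ in cells)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def mean_nearest_distance(cells, source, target):
    """Average distance from each source cell to its closest target cell."""
    src = [(x, y) for label, x, y in cells if label == source]
    tgt = [(x, y) for label, x, y in cells if label == target]
    if not src or not tgt:
        return None
    return sum(min(math.dist(s, t) for t in tgt) for s in src) / len(src)


# Hypothetical centroids (label, x, y) in arbitrary units.
toy_cells = [
    ("tumor", 0, 0), ("tumor", 10, 0),
    ("lymphocyte", 0, 3), ("lymphocyte", 10, 4),
    ("stroma", 5, 5),
]

fractions = composition(toy_cells)
til_to_tumor = mean_nearest_distance(toy_cells, "lymphocyte", "tumor")
print(fractions, til_to_tumor)
```

Two slides with identical cell-type fractions can differ sharply in the second measure, which is one simple way to see why proportions alone may give an incomplete picture of the TME.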
Accurate characterization of specific structures and features of TME is crucial for evaluating tumor prognosis (55), enhancing clinical decisions, and advancing precision medicine. Radiomics can unveil the heterogeneity of tumor cells and TME, while genomics and pathomics explore the biological significance of imaging histological features. The integration of these three approaches contributes to a comprehensive understanding and decoding of TME characteristics in tumors, facilitating prognostic predictions (56). The interconnection between radiomics, pathomics, and genomics contributes to establishing and deepening our understanding of cancer biology and imaging features. Concurrently, powerful machine learning techniques can decipher the complex interactions between tumors and cancer treatments. The integration of machine learning technologies with digital imaging and novel methods for assessing TME at the molecular level significantly enhances our comprehension of TME and cancer prognosis assessment. Vanguri RS et al. employed machine learning to integrate multimodal features into a risk prediction model (57). By combining radiological, histopathological, and genomic features, they assessed the predictive capability of immunotherapy response in NSCLC. Their study revealed that the AUC value of the multimodal model was 0.80, surpassing any single variable. These findings establish a quantitative foundation for enhancing the accuracy of predicting immunotherapy response in NSCLC patients through the integration of multimodal features and machine learning.
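To illustrate what comparing a multimodal model against single modalities by AUC involves, the sketch below computes AUC as the fraction of responder and non-responder pairs that a score ranks correctly, and fuses three modality scores by simple averaging. The averaging step is a stand-in chosen for brevity and is not the integration method used by Vanguri RS et al.; all scores and labels are invented toy values.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def late_fusion(*modalities):
    """Average per-patient scores across modalities (a simple illustrative fusion)."""
    return [sum(values) / len(values) for values in zip(*modalities)]


# Toy data: 1 = responder, 0 = non-responder; scores are hypothetical.
labels = [1, 1, 1, 0, 0, 0]
radiology = [0.9, 0.4, 0.6, 0.5, 0.3, 0.7]
pathology = [0.6, 0.8, 0.3, 0.4, 0.5, 0.2]
genomics = [0.7, 0.5, 0.8, 0.6, 0.2, 0.3]

fused = late_fusion(radiology, pathology, genomics)
single_aucs = {
    "radiology": auc(labels, radiology),
    "pathology": auc(labels, pathology),
    "genomics": auc(labels, genomics),
}
fused_auc = auc(labels, fused)
print(single_aucs, fused_auc)
```

In this toy case the fused score ranks patients better than any single modality because the modalities make different errors; whether that holds on real data depends on the cohort and the fusion method.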
Simultaneously, the quantitative characterization of TME in lung cancer poses certain challenges, including the following aspects:
(1) Complexity and heterogeneity of lung cancer TME composition: In addition to the mentioned cell types, other structures such as bronchi, cartilage, and pleura often appear in pathological sections of the lung. This complexity and heterogeneity make segmentation and traditional feature definition challenging.
(2) Cellular spatial organization (e.g., spatial distribution and interactions of different cell types): While playing a crucial role in TME, it is more challenging to capture than simply providing the quantity or ratio of different cell types. Current research mainly focuses on the proportion of different cell types, overlooking the intricate cellular spatial organization, which may result in limited and contradictory outcomes regarding the roles of different cell types in the TME.
(3) For H&E-stained glass slides, there can be significant color variations based on staining conditions and the time gap between slide preparation and scanning. Traditional image processing methods based on manual feature extraction struggle to overcome these obstacles.
(4) Multi-omics studies face the high dimensionality and heterogeneity of data, and integrating quantitative measurements of multi-modal data for prognosis prediction is a highly challenging task.
In summary, pathomics, as a nascent research methodology, is presently undergoing preliminary investigation. Future studies utilizing extensive multi-omics datasets have the potential to advance the formulation of sophisticated integration strategies. These strategies would facilitate a more exhaustive evaluation, characterization, and elucidation of TME (58). Consequently, this advancement will yield profound insights into the imaging characteristics and the pathophysiological and biological underpinnings of tumor pathology.
In recent years, amidst the high incidence and mortality rates of lung cancer, the selection and implementation of treatment plans for advanced-stage lung cancer patients, as well as the creation of more precise platforms for predicting treatment responses, continue to face challenges. Pathomics not only synergizes with traditional pathological semantic information and clinical data to discover disease patterns but also interacts and integrates with various omics information, leveraging the unique advantages of each omics discipline. The development of these interdisciplinary approaches not only aids in identifying subtle lesions that may escape the naked eye and uncovering disease patterns beyond subjective judgment but also provides relatively objective and accurate assistance in disease screening, diagnosis, differential diagnosis, and prognosis assessment. Furthermore, it helps save human and material resources, makes the best use of limited medical resources, and, on a broader scale, promotes the development of personalized immune interventions.

Conclusion
In conclusion, this study systematically analyzed the literature on pathomics in the management of lung cancer indexed within the WoSCC. It offers an initial overview of recent research trends and forecasts potential hotspots and frontiers for future inquiry, aiming to provide valuable insights and references for scholars and researchers involved in personalized immunotherapy efficacy and prognosis for lung cancer.

FIGURE 1 Annual analysis of the number of articles issued.
illustrates the temporal frequency changes of different keywords over time. It highlights the research focal points in the past few years related to the application of AI-based pathomics in the diagnosis and treatment of lung cancer. These themes reflect the proactive role of pathomics in aiding diagnosis, classification, predicting treatment efficacy, risk assessment, exploring emerging biomarkers, and analyzing gene expression levels in the context of lung cancer diagnosis and treatment.

FIGURE 3 Proportion of national contribution.

FIGURE 5 Visual map of author keywords.

FIGURE 7 Visual map of author-generated keywords network.
application of deep learning technology in histopathological tissue slices (deep pathomics) with the aim of predicting the response of stage III NSCLC to treatment (33). They assessed 35 digitalized tissue slices (biopsy or surgical specimens) from patients with stage IIIA or IIIB NSCLC. Based on the reduction in target volume observed in weekly CT scans during chemoradiotherapy, patients were categorized as responders (12/35, 34.3%) and non-responders (23/35, 65.7%). Employing a leave-two-out cross-validation method, they tested the digital tissue slices using 5 pre-trained CNNs (AlexNet, VGG, MobileNet, GoogLeNet, and ResNet) and evaluated the network performance. GoogLeNet was identified as the most effective CNN, accurately classifying 8/12 responders and 10/11 non-responders. Furthermore, deep pathomics exhibited a high level of specificity (true negative rate: 90.1%) and considerable sensitivity (true positive rate: 75%). Their data suggest that AI can surpass the capabilities of current diagnostic systems, providing additional insights beyond what is currently attainable in clinical practice. Furthermore, there are studies attempting to apply AI to histological images with the aim of discovering novel image-based prognostic and predictive biomarkers. Cao R et al. proposed a deep learning model based on histopathological images to predict microsatellite status, achieving area under curve (AUC) of 0.88 and 0.85, respectively. It is noteworthy that this model can identify five distinct pathological imaging features, which are associated with the mutation burden in the genome, DNA damage repair-related genotypes, and the anti-tumor immune activation pathway in the transcriptome. The predictive model provides the potential for multi-omics correlations through interpretability associated with pathology, genomics, and transcriptomics phenotypes (34). Wang X et al. developed a system capable of identifying high-risk recurrence in early-stage NSCLC patients with an accuracy ranging from 75% to 82% (22). In another study, Wang S et al. characterized a group of high-risk NSCLC patients and identified image-based tumor shape features as an independent prognostic factor (35). Rakaee M et al. developed a machine learning-based scoring system

TABLE 1
Top 10 journals in terms of publication volume.

TABLE 2
Top 5 institutions in terms of publication volume.

TABLE 3
Top 5 authors in terms of publication volume.

TABLE 4
High frequency and centrality keywords.

TABLE 5
The top 10 cited articles.